Fertility-based Source-Language-biased Inversion Transduction Grammar for Word Alignment

Authors

  • Chung-Chi Huang
  • Jason S. Chang
Abstract

We propose a version of the Inversion Transduction Grammar (ITG) model with an IBM-style notion of fertility to improve word-alignment performance. In our approach, binary context-free grammar rules of the source language, accompanied by orientation preferences of the target language and fertilities of words, are leveraged to construct a syntax-based statistical translation model. Our model inherently possesses the characteristics of ITG restrictions and allows many consecutive words to be aligned to one word and vice versa. It outperforms the Bracketing Transduction Grammar (BTG) model and GIZA++, a state-of-the-art word aligner, not only in alignment error rate (23% and 14% error reduction) but also in consistent phrase error rate (13% and 9% error reduction). Better performance on these two evaluation metrics suggests that more accurate phrase pairs may be acquired from our word alignment results, leading to better machine translation quality.
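
The abstract reports results as alignment error rate (AER) and as relative error reductions. As a rough illustration only, the sketch below uses the AER definition commonly used in word-alignment work, AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where A is the predicted link set, S the sure gold links, and P the possible gold links. This page does not state the paper's exact evaluation setup, so that definition is an assumption here, and the function names and toy links are hypothetical.

```python
def alignment_error_rate(predicted, sure, possible):
    """Commonly used AER definition (assumed; not confirmed by the abstract).

    Each argument is an iterable of (source_index, target_index) links.
    """
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s  # sure links are treated as possible links too
    if not a and not s:
        return 0.0  # nothing predicted and nothing expected
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline's error removed by the new system."""
    return (baseline_error - new_error) / baseline_error


# Toy links, made up for illustration; they are not data from the paper.
predicted = {(0, 0), (1, 2), (2, 1), (3, 3)}
sure = {(0, 0), (1, 2), (2, 1)}
possible = {(3, 4)}

aer = alignment_error_rate(predicted, sure, possible)
print(f"AER on the toy example: {aer:.4f}")
```

Under this reading, a figure such as "23% error reduction" is a relative measure: it compares the new system's error with the baseline's error and does not mean an absolute drop of 23 points.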

Related resources

Improving Word Alignment Based on Extended Inversion Transduction Grammar

We propose a fusion of the Inversion Transduction Grammar model with an IBM-style notion of fertility to improve word-aligning performance. In our approach, binary context-free grammar rules on the source language, accompanied by orientation preferences on the target, and fertilities of words are leveraged to construct a syntax-based statistical translation model. Our model, inherently possessing t...

A Systematic Comparison between Inversion Transduction Grammar and Linear Transduction Grammar for Word Alignment

We present two contributions to grammar-driven translation. First, since both Inversion Transduction Grammar and Linear Inversion Transduction Grammars have been shown to produce better alignments than the standard word alignment tool, we investigate how the trade-off between speed and end-to-end translation quality extends to the choice of grammar formalism. Second, we prove that Linear Transd...

Improving Phrase-Based Translation via Word Alignments from Stochastic Inversion Transduction Grammars

We argue that learning word alignments through a compositionally-structured, joint process yields higher phrase-based translation accuracy than the conventional heuristic of intersecting conditional models. Flawed word alignments can lead to flawed phrase translations that damage translation accuracy. Yet the IBM word alignments usually used today are known to be flawed, in large part because I...

An Algorithm for Simultaneously Bracketing Parallel Texts by Aligning Words

We describe a grammarless method for simultaneously bracketing both halves of a parallel text and giving word alignments, assuming only a translation lexicon for the language pair. We introduce inversion-invariant transduction grammars which serve as generative models for parallel bilingual sentences with weak order constraints. Focusing on transduction grammars for bracketing, we formulate a no...

Better Semantic Frame Based MT Evaluation via Inversion Transduction Grammars

We introduce an inversion transduction grammar based restructuring of the MEANT automatic semantic frame based MT evaluation metric, which, by leveraging ITG language biases, is able to further improve upon MEANT’s already-high correlation with human adequacy judgments. The new metric, called IMEANT, uses bracketing ITGs to biparse the reference and machine translations, but subject to obeying ...

Journal:
  • IJCLCLP

Volume 14

Publication date: 2009